Bootstrapping Fine-Grained Classifiers: Active Learning with a Crowd in the Loop

Authors

  • Genevieve Patterson
  • Grant Van Horn
  • Serge Belongie
  • Pietro Perona
  • James Hays

Abstract

We propose an iterative crowd-enabled active learning algorithm for building high-precision visual classifiers from unlabeled images. Our method employs domain experts to identify a small number of examples of a specific visual event. These expert-labeled examples seed a classifier, which is then iteratively trained by active querying of a non-expert crowd. These non-experts actively refine the classifiers at every iteration by answering simple binary questions about the classifiers’ detections. The advantage of this approach is that experts efficiently shepherd an unsophisticated crowd into training a classifier capable of fine-grained distinctions. This obviates the need to label an entire dataset to obtain high-precision classifiers. We find these classifiers are advantageous for creating a large vocabulary of visual attributes for specialized taxonomies. We demonstrate our crowd active learning pipeline by creating classifiers for attributes related to North American birds and fashion.
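
The loop described in the abstract (expert seed, classifier, crowd verification of detections, retraining) can be illustrated with a minimal sketch. This is not the authors' implementation. The nearest-centroid classifier, the 2-D synthetic features, the simulated crowd (`crowd_vote`, three workers with 90% accuracy, majority vote), the presence of negative examples in the expert seed, and all function and parameter names are assumptions made only for illustration.

```python
import random

def centroid(points):
    """Mean of a non-empty list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(labeled):
    """Hypothetical stand-in classifier: one centroid per class."""
    pos = [x for x, y in labeled if y]
    neg = [x for x, y in labeled if not y]
    return centroid(pos), centroid(neg)

def score(model, x):
    """Higher means more likely positive (closer to the positive centroid)."""
    pos_c, neg_c = model
    return sq_dist(x, neg_c) - sq_dist(x, pos_c)

def crowd_vote(true_label, rng, workers=3, accuracy=0.9):
    """Simulated non-expert crowd: each worker answers a yes/no question
    correctly with probability `accuracy`; the majority answer wins."""
    votes = sum(
        (true_label if rng.random() < accuracy else not true_label)
        for _ in range(workers)
    )
    return votes * 2 > workers

def crowd_active_learning(seed, pool, truth, rng, rounds=4, batch=10):
    """Seed with expert labels, then repeatedly ask the crowd to verify
    the classifier's top-scoring detections and retrain on the answers."""
    labeled = list(seed)
    unlabeled = list(range(len(pool)))
    for _ in range(rounds):
        model = train(labeled)
        unlabeled.sort(key=lambda i: score(model, pool[i]), reverse=True)
        queried, unlabeled = unlabeled[:batch], unlabeled[batch:]
        for i in queried:
            # `truth` only drives the simulated crowd; a real system
            # would show the image to workers instead.
            labeled.append((pool[i], crowd_vote(truth[i], rng)))
    return train(labeled), labeled

# Synthetic demo data: positives cluster near (2, 2), negatives near (0, 0).
rng = random.Random(0)

def sample(center, n):
    return [[rng.gauss(center[0], 0.5), rng.gauss(center[1], 0.5)] for _ in range(n)]

pool = sample((2.0, 2.0), 40) + sample((0.0, 0.0), 160)
truth = [True] * 40 + [False] * 160
seed = [(x, True) for x in sample((2.0, 2.0), 3)] + \
       [(x, False) for x in sample((0.0, 0.0), 3)]

model, labeled = crowd_active_learning(seed, pool, truth, rng)
print(len(labeled), "labeled examples after crowd refinement")
```

In this sketch only 6 expert labels are used. The remaining labels come from binary crowd answers about the classifier's own detections, which mirrors the division of labor the abstract describes.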

Similar resources

Crowdsourcing WordNet

This paper describes an experiment in using Amazon Mechanical Turk to collaboratively create a sense inventory. In a bootstrapping process with massive collaborative input, substitutions for target words in context are elicited and clustered by sense; then more contexts are collected. Contexts that cannot be assigned to a current target word’s sense inventory re-enter the loop and get a supply ...

Heuristic Methods for Reducing Errors of Geographic Named Entities Learned by Bootstrapping

One of the issues in bootstrapping for named entity recognition is how to control the annotation errors introduced at every iteration. In this paper, we present several heuristics for reducing such errors using external resources such as WordNet, encyclopedias, and Web documents. The bootstrapping is applied to identifying and classifying fine-grained geographic named entities, which are useful for ...

Bootstrapping Coreference Classifiers with Multiple Machine Learning Algorithms

Successful application of multi-view cotraining algorithms relies on the ability to factor the available features into views that are compatible and uncorrelated. This can potentially preclude their use on problems such as coreference resolution that lack an obvious feature split. To bootstrap coreference classifiers, we propose and evaluate a single-view weakly supervised algorithm that relies...

Active Learning for Crowd-Sourced Databases

Crowd-sourcing has become a popular means of acquiring labeled data for many tasks where humans are more accurate than computers, such as image tagging, entity resolution, or sentiment analysis. However, due to the time and cost of human labor, solutions that solely rely on crowd-sourcing are often limited to small datasets (i.e., a few thousand items). This paper proposes algorithms for integr...

A Scalable Bootstrapping Framework for Auto-Annotation of Large Image Collections

Image annotation aims to assign semantic concepts to images based on their visual contents. It has received much attention recently as huge dynamic collections of images/videos become available on the Web. Most recent approaches employ supervised learning techniques, which have the limitation that a large set of labeled training samples is required for effective learning. This is both tedious a...

Publication date: 2013